Prepare dataset

The raw dataset has 613 lib GAVs (Maven groupId:artifactId:version coordinates), with information on the following variables:

##  [1] "Size_original_jar"            "Size_debloat_jar"            
##  [3] "Nb_classes_original"          "Nb_methods_original"         
##  [5] "Nb_classes_debloated"         "Nb_methods_debloated"        
##  [7] "Lib_coverage"                 "Client_original_test_error"  
##  [9] "Client_original_test_failing" "Client_original_test_passing"
## [11] "Client_debloat_test_error"    "Client_debloat_test_failing" 
## [13] "Client_debloat_test_passing"  "Client_coverage"             
## [15] "Cover_lib"                    "Lib"                         
## [17] "Lib_gav"                      "Client"

We exclude the libs with Size_debloat_jar == 0 and Nb_classes_original == 0. This results in a dataset with 72 different libs and 468 lib GAVs.
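The filtering step can be sketched in base R as follows. This is only a minimal sketch on a toy data frame: the name `raw` and all of its rows are made up for illustration, and it assumes that a row is dropped when either of the two columns is zero, which is one possible reading of the rule above.

```r
# Toy stand-in for the raw dataset (hypothetical values, not the real data).
raw <- data.frame(
  Lib                 = c("a", "a", "b", "c", "d"),
  Lib_gav             = c("g:a:1.0", "g:a:2.0", "g:b:1.0", "g:c:1.0", "g:d:1.0"),
  Size_debloat_jar    = c(1200, 900, 0, 300, 450),
  Nb_classes_original = c(10, 8, 5, 0, 7),
  stringsAsFactors    = FALSE
)

# Assumption: drop a row when the debloated JAR is empty OR no classes were counted.
clean <- subset(raw, Size_debloat_jar != 0 & Nb_classes_original != 0)

n_libs <- length(unique(clean$Lib))      # distinct libraries
n_gavs <- length(unique(clean$Lib_gav))  # distinct lib GAVs

cat("libs:", n_libs, " lib GAVs:", n_gavs, "\n")
```

Counting `unique()` values separately for `Lib` and `Lib_gav` is what distinguishes the number of libraries from the number of library versions.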

Distribution of clients per library

The lib commons-io:commons-io:2.4 has the largest number of clients (874), followed by commons-cli:commons-cli:1.2 (279) and commons-codec:commons-codec:1.10 (257).
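A ranking like this one could be computed roughly as shown here. The data frame `pairs` and its contents are hypothetical; only the column names `Lib_gav` and `Client` come from the dataset, and the sketch assumes that each row links one client to one lib GAV.

```r
# Toy client/library pairs (hypothetical values).
pairs <- data.frame(
  Lib_gav = c("g:a:1.0", "g:a:1.0", "g:a:1.0", "g:b:1.0", "g:b:1.0", "g:c:1.0"),
  Client  = c("c1", "c2", "c3", "c1", "c4", "c2"),
  stringsAsFactors = FALSE
)

# Keep one row per (Lib_gav, Client) so a client is counted once per library.
pairs <- unique(pairs[, c("Lib_gav", "Client")])

# Count clients per lib GAV and sort from most to least used.
clients_per_lib <- sort(table(pairs$Lib_gav), decreasing = TRUE)

# The three most used lib GAVs.
top3 <- head(clients_per_lib, 3)
print(top3)
```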

Library plots

Size of the original vs size debloated

The debloated JAR is slightly bigger than the original one. It seems that JDBL is adding some dependencies or resources that shouldn’t be included in the bundled JAR. We need to investigate this in more detail.

Let’s see which libraries have a debloated JAR that is bigger than the original one.

Let’s plot the difference in size between the debloated and the original JARs.
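The size comparison can be sketched as below. The data frame `libs`, its values, and the derived column `Size_diff` are hypothetical; the unit of the size columns is not stated in this report, so the numbers are treated as plain magnitudes.

```r
# Toy JAR sizes (hypothetical values; unit unspecified).
libs <- data.frame(
  Lib_gav           = c("g:a:1.0", "g:b:1.0", "g:c:1.0"),
  Size_original_jar = c(1000, 2000, 500),
  Size_debloat_jar  = c(800, 2100, 500),
  stringsAsFactors  = FALSE
)

# A positive difference means the debloated JAR grew.
libs$Size_diff <- libs$Size_debloat_jar - libs$Size_original_jar

# Libraries whose debloated JAR is bigger than the original, largest growth first.
bigger <- libs[libs$Size_diff > 0, ]
bigger <- bigger[order(-bigger$Size_diff), ]

print(bigger[, c("Lib_gav", "Size_diff")])
```

The `Size_diff` column can then be passed to whichever plotting function is used for the figure.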

Number of classes in the original vs number of classes in the debloated

Number of methods in the original vs number of methods in the debloated

Percentage of methods reduced per library.

## Warning: Removed 114 rows containing missing values (geom_bar).

There are some libs, such as DiUS_java-faker, for which the number of methods is zero.
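The sketch below shows how a zero method count turns into an undefined percentage in R. The data frame `methods`, its rows, and the column `Pct_reduced` are hypothetical, and the formula is an assumed definition of "percentage of methods reduced". Whether this is what produces the removed rows in the warning above is also an assumption, not something this report confirms.

```r
# Toy method counts (hypothetical values).
methods <- data.frame(
  Lib                  = c("a", "b", "zero-methods-lib"),
  Nb_methods_original  = c(200, 50, 0),
  Nb_methods_debloated = c(150, 50, 0),
  stringsAsFactors     = FALSE
)

# Assumed definition: share of the original methods that were removed.
# In R, 0 / 0 evaluates to NaN, so libs with zero methods get no percentage.
methods$Pct_reduced <- 100 *
  (methods$Nb_methods_original - methods$Nb_methods_debloated) /
  methods$Nb_methods_original

# Libraries whose percentage is undefined.
undefined <- methods[is.nan(methods$Pct_reduced), "Lib"]

print(methods)
```

Listing the `undefined` libraries before plotting makes it easier to decide whether they should be excluded or fixed upstream.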

Lib coverage

Client plots

Client test results: original vs debloated

## [1] "#Clients original with at least one test error: "
## [1] 7050
## [1] "#Clients debloated with at least one test error: "
## [1] 738
## [1] "#Clients original with zero test error: "
## [1] 0
## [1] "#Clients debloated with zero test error: "
## [1] 6312
## [1] "#Clients debloated with zero test error and at least one debloated test error: "
## [1] 0
## [1] "#Clients debloated with zero test error and at least one debloated test error: "
## [1] 7050
## [1] "#Clients original with at least one test fail: "
## [1] 282
## [1] "#Clients debloated with at least one test fail: "
## [1] 282
## [1] "#Clients original with zero test fail: "
## [1] 6768
## [1] "#Clients debloated with zero test fail: "
## [1] 6768
## [1] "#Clients debloated with zero test fail and at least one debloated test fail: "
## [1] 0
## [1] "#Clients debloated with zero test fail and at least one debloated test fail: "
## [1] 0
## [1] "#Clients original with at least one test pass: "
## [1] 2236
## [1] "#Clients debloated with at least one test pass: "
## [1] 1876
## [1] "#Clients original with zero test pass: "
## [1] 4814
## [1] "#Clients debloated with zero test pass: "
## [1] 5174
## [1] "#Clients debloated with zero test pass and at least one debloated test pass: "
## [1] 0
## [1] "#Clients debloated with zero test pass and at least one debloated test pass: "
## [1] 0
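Counts of this kind can be derived with logical sums, as in the sketch below. The data frame `clients`, its values, and the names in `counts` are hypothetical; only the two `*_test_error` column names come from the dataset. The same pattern would apply to the failing and passing columns.

```r
# Toy per-client test error counts (hypothetical values).
clients <- data.frame(
  Client                     = c("c1", "c2", "c3", "c4"),
  Client_original_test_error = c(0, 2, 0, 1),
  Client_debloat_test_error  = c(0, 2, 3, 0),
  stringsAsFactors           = FALSE
)

orig_err <- clients$Client_original_test_error
debl_err <- clients$Client_debloat_test_error

# sum() over a logical vector counts the TRUE entries.
counts <- c(
  original_at_least_one_error  = sum(orig_err > 0),
  debloated_at_least_one_error = sum(debl_err > 0),
  original_zero_error          = sum(orig_err == 0),
  debloated_zero_error         = sum(debl_err == 0),
  # Clients that only start erroring after debloating.
  new_errors_after_debloat     = sum(orig_err == 0 & debl_err > 0)
)

print(counts)
```

As a sanity check, the "at least one" and "zero" counts for the same variant should add up to the total number of clients, as they do in the output above (for example 738 + 6312 = 7050).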

Violin plots

Bar plots